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Abstract 

We develop a new class of overlapping Schwarz type algorithms for solving scalar convection- 
diffusion equations discretized by finite element or finite difference methods. The precon- 
ditioners consist of two components, namely, the usual two-level additive Schwarz precon- 
ditioner and the sum of some quadratic terms constructed by using products of ordered 
neighboring subdomain preconditioners. The ordering of the subdomain preconditioners is 
determined by considering the direction of the flow. We prove that the algorithms are opti- 
mal in the sense that the convergence rates are independent of the mesh size, as well as the 
number of subdomains. We show by numerical examples that the new algorithms are less 
sensitive to the direction of the flow than either the classical multiplicative Schwarz algo- 
rithms, and converge faster than the additive Schwarz algorithms. Thus, the new algorithms 
are more suitable for fluid flow 7 applications than the classical additive or multiplicative 
Schwarz algorithms. 
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1. Introduction. In this paper, we present some new overlapping domain decompo- 
sition methods for the numerical solution of large, sparse, nonsymmetric and/or indefinite 
linear systems of equations arising from Galerkin finite element discretizations of elliptic par- 
tial differential equations. The new algorithms belong to the family of overlapping Schwarz 
methods which is a. variant of the classical Schwarz alternating algorithm, introduced in 1870 
by H. A. Schwarz [22] in an existence proof of elliptic boundary value problems defined in 
certain irregular regions. This family of methods has attracted much attention in the past 
few vears as convenient and powerful in the solution of partial differential equations, see, 
especially on parallel machines. For tutorial presentations, see, e.g. [7,23]. 

We shall focus on linear nonsymmetric and/or indefinite second-order elliptic finite ele- 
ment or finite difference equations. The solution of such problems is an important compu- 
tational kernel in implicit methods, such as solving systems involving the Jacobian in any 
Newton-like method used in computational fluid dynamics, [3, 4, 24]. This family of methods 
is built upon the so-called subdomain mapping operators X, which solve the original prob- 
lem, defined on a domain ft, approximately in subdomains ft, C 0 with artificial boundary 
conditions and zero extensions to ft - ft'. The formal definitions of X and ft' will be given in 
the next section. By using these X’s as basic building blocks, a family of polynomial Schwarz 
algorithms can be defined. Let A be the number of subdomains and To the coarse space 
mapping operator. We define 

T = poly(T 0 , 7i, • • • ,Xn) 

as a multi-dimensional matrix-valued polynomial with variables X,, and assume that the 
polynomial satisfies poly(Q , • • • .0) = 0, which simply means that the constant term in the 
polynomial is zero. It is known that if u* is the exact solution of the finite element equations 
then Tu * can be computed without knowing u* itself. This is because T,u , i — 0, • • • , A , 
can be computed directly from the right-hand side function of the finite element equations. 
With g = Tu * as the new right-hand side vector, a new linear system can be introduced as 

Tu = g 

and it is not difficult to show that if T is nonsingular then the new linear system gives 
the desired finite element solution u*. For each choice of the polynomial poly, a particular 
Schwarz algorithm is defined. The algorithm is called optimal if the condition number, or 
some other “equivalent measure” for nonsymmetric or indefinite problems, of the operator 
T is independent of the mesh parameter h and the number of subdomains N. Several such 
optimal algorithms, such as the additive (T = Hflo X) and multiplicative (X = / — n i=0 (/ — 
Ti)) Schwarz algorithms, have been identified. Generally, the additive algorithms have two 
features among others: 

• They converge more slowly than the multiplicative algorithms because of the lack 
of subdomain-to-subdomain communications within each iteration; 

• Their convergence is independent of the ordering and coloring of the subdomains. 
The features of the multiplicative algorithms include: 

• They are faster in terms of the total iteration number; 
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• They are not as parallel as the additive algorithms because of the data dependence 
between overlapping subdomains 

• They have a strong dependence on the global ordering and coloring of subdomains 
especially for convection-diffusion problems. 

See the last section of this paper for a detailed discussion on the ordering and coloring 
issues. To use the multiplicative algorithms efficiently, it is important to color and order 
the subdomains correctly. However, to obtain the optimal coloring and ordering is difficult 
in practice especially when the underlying mesh is unstructured and the subdomains are 
obtained by means of graph partitioning, see, e.g., [3, 10]. For a particular problem and a 
given subdomain partitioning, it is not impossible to obtain a reasonable subdomain coloring 
and ordering according to certain practical heuristics, but, in general, especially for unsteady 
problems where the flow direction changes from time step to time step, it becomes desirable 
to have algorithms that do not need, or depend less on, manual subdomain ordering and 
coloring. Extensive discussions on the effects of ordering and coloring of nodes or elements, 
in the context of iterative and direct sparse matrix computations, can be found in many 
research papers, see, e.g., [1, 9, 15, 17]. Some of the ideas and techniques can also be 
applied, with certain modifications, to the coloring and ordering of overlapping subdomains. 
We will not consider these techniques in this paper, since our interest is in automating the 
construction of the preconditioner. 

In this paper, we shall identify some overlapping Schwarz algorithms, which we call the 
local multiplicative Schwarz methods. The new algorithms are not only optimal but also 
have convergence rates that are: 

• better than that of the additive Schwarz method; 

• not sensitive to the coloring and global ordering of the subdomains, nor the flow 
direction; 

• more parallel than the multiplicative Schwarz algorithm. 

Our basic idea is to use the multiplicative Schwarz algorithm only locally between those 
pairs of overlapping subdomains for which we have effective techniques to determine the 
flow direction without any global operations. We use additive techniques to handle the 
global communication between pairs of subdomains and the coarse level preconditioning. 

The paper is organized as follows. In Section 2, we define our model elliptic problem, its 
discretization and the overlapping partitioning of the finite element mesh. In Section 3, we 
introduce and analyze the new local multiplicative Schwarz algorithms for symmetric and 
general nonsymmetric problems. In the last section of the paper we provide some numerical 
examples regarding the performance of the new algorithms, as well as some comparisons 
with the classical additive and multiplicative Schwarz algorithms. 

2. Model problems and subdomain partitioning. Let ft be an open, bounded 
polygonal region in R d ,d = 2 or 3, with boundary dft. We consider the homogeneous 
Dirichlet boundary value problem 


Lu(x) = f(x) in ft, 
u(x) = 0 on dfl. 
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Here the elliptic operator L has the form Lu(x) = -V • (Vu) + 2/3(x) • Vu + c(x)u. All the 
coefficients are, by assumption, sufficiently smooth and the right-hand side / € L 2 {fl). We 
assume that the equation has a unique solution in Hq{CI). Let (-,-) denote the usual L 2 (0) 
inner product and || • || or || • the corresponding norm. The weak form of equation (1) is: 
Find u G Hq(Q) such that 

(2) b(u, v ) = (/, v), Vv € Hq(Q). 

The bilinear form b(u,v) is defined by 

b(u,v) = J V« • Vvdx + j J$-Vu)vdx + j ^ V • (0u)vdx + j Jcuv)dx. 

Here c(x) = c(x) In addition to the following bilinear form 

a(u,u) = 

which is used as the usual energy inner product in Hq(Q) with norm defined by ||u|| 0 = 
(a(u,u)) 1/2 , we also use two other bilinear forms 

s(u,v) = j (/? ‘Vu)vdx + /qV ' ((3u)vdx 

and c(u, v) = (cu, v), which correspond to the skew-symmetric and zeroth order parts of L , 
respectively. It is easy to verify that 

s(u,v) - -s(v,u), Vu,ue/fo(^)- 

Following Dryja and Widlund [13], we define a two-level conforming finite element trian- 
gulation of f2. The region ft is first divided into nonoverlapping subdomains fy, i = 1, • • • , N, 
such that H = U=i Then a11 the subdomains 0„ which are assumed to have diameter 
of order H , are divided into triangular elements of size h. We assume that the union of all 
of the elements of size h , forms a regular finite element triangulation of fi. The common 
assumption, in finite element theory (cf. [8]), that all elements are shape regular is adopted. 
With such a triangulation, we let V h C H 2 {U) be the usual piecewise linear continuous finite 
element space on 0. To obtain an overlapping decomposition of the domain, we extend 
each subdomain H, to a larger region i.e. C 0' C 0. We assume that the over- 
lap is uniformly large and let V { = C V h be the usual finite element subspace 

defined over fl' z . with zero extension to H — Here uniformly large overlap means that 
distance(dfL fl'fi , Dili nQ) > cH, where c > 0 is a constant independent of H. It is clear 

that (l = L t an( i H = Vo + Vi -1 + Vff. 

The finite element discretization of (2) reads as follows: Find u‘ € V h such that 

(3) b(u*,v) = {f, v ), VueVfc. 

Another key ingredient in the design of optimal domain decomposition preconditioners 
is the use of at least one global coarse space, which in a way connects the local subdomains 
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just introduced. A number of coarse spaces have been introduced in the literature, see, e.g., 
[11, 12], We shall focus only on a simple one. Let fl# = {r,} be a quasi-uniform triangulation 
of D and r, one of the triangles with a diameter on the order of H. f Ih is the coarse grid. 
Let Vo be the piecewise linear continuous finite element space on f Ih- In the analysis part 
of this paper we assume, for simplicity, that Vo C Vhi and that the diameter of the coarse 
elements t; is of the same order as the diameter of the subdomains f2 ,. The theory can easily 
be extended to the case of a non-nested coarse [2], and to cases with small overlap [14]. 

In the numerical experiments section, we shall present some cases where the sizes of the 
subdomains and the coarse elements are of different order. 

For each i = 0, 1, . . . , N, we define a mapping operator Tj : V4 — » V by 

(4) b(TiU,v) = b(u,v), Vu € V4, Vu £ V . 

These T, will serve as the basic building blocks of the algorithms to be discussed in the 
next sections. We shall mention that these T.’s can also be defined inexactly if we replace 
the left-hand side bilinear form in (4) by a different bilinear form, which, in some sense is 
equivalent to &(•,•). Details on inexact Schwarz algorithms can be found in, for example, 
[5, 7, 23], 

3. New algorithms and analysis. In this section, we define the local multiplicative 
Schwarz algorithms by using the basic Schwarz building blocks Ti defined in the previous 
section. For each pair of neighboring subdomains, with indices i and j, we define a multi- 
plicative Schwarz operator 

Pii = I ~(I- Tj){I ~ T,). 

Note that for any u £ V h , P lJ u £ V, + Vj, and generally P tJ ^ P Jt , unless O' and ft' have no 
common points. Let 

(5) P = To + ^2 P,j, 

where the summation is taken over all possible P, : 's. Let g tJ = Piju * and go — T 0 u *; as 
mentioned earlier, both can be computed without the knowledge of it*. With g = go + H 9iji 
it can be seen that if the operator P is nonsingular, then the linear system 

(6) Pu = g 

has the same solution as that of (3). We shall prove in the remainder of the paper that P 
is indeed nonsingular and uniformly well-conditioned, and that therefore (6) can be solved 
by using certain Krylov space type iterative acceleration methods, such as CG or GMRES 
[21]. We remark that if the bilinear form £>(•,•) is symmetric, then the operator P is also 
symmetric with respect to &(•, •). In other words, the local multiplicative Schw r arz operator P 
is symmetric if both Pij and Pji are included in its definition. Later, in this section, we shall 
intentionally destroy the symmetry by dropping one of the two terms when solving nonsym- 
metric problems. Keeping only the terms in the upwind direction makes the algorithm very 
useful for convection-diffusion equations. Like other upwinding type discretization schemes, 
we shall also introduce a parameter g that controls the amount of the upwinding, or artificial 
diffusion, in the Schwarz preconditioning polynomial. 

4 



3.1. Analysis for the symmetric positive definite case. Since the symmetric posi- 
tive definite case is rather simple, we consider it here. Throughout this subsection we assume 
that &(•,■) = «(*,*)• The ful1 abstract theory of Dryja and Widlund, [13], on the optimal 
convergence of the additive Schwarz methods cannot be used directly because our subprob 
lem operators Pij are not defined as projections. We summarize the results of the symmetric 
case in the following theorem. 

THEOREM 1. There exist positive constants c and C\ independent of the the mesh 
parameters h and H } such that 

c||u||^<a(P U ,u)<C|H| ! „, 

for any u € U 

One of the key facts that we shall use in the proof of the theorem is given in the following 
lemma due to Dryja and Widlund [13]. 

Lemma 3.1 (Dryja and W1DLUND[13]). There exist positive constants c and C, 
independent of the mesh parameters, such that 

c\\u\\l<h\TM\l<C\\u\\l 

i=0 

for any u 6 Vh ■ 

We remark again that since both P tJ and P,; are included in the definition of P , the 
operator P is symmetric with respect to &(-,-)• The upper bound of the operator P can be 
obtained easily, since 

n«<«ii2 < cum: + iMi:i 

and 

\\Pu\\ 2 a<C^2\\PM\l 

By using Lemma 3.1, we obtain that ||Pti|| a < C|M| 0 , for any u € V h and where C > 0 is a 
constant independent of the mesh parameters. To obtain the lower bound, we note that 

(7) a(PijU,u) = \\T,u\\ 2 a + ||I>|la - a(Tiu,Tju). 

Using the fact that a{T{U,Tjv) < ||2;M|| a ||7VulU < l/2(!|T,«||a + l|7>lla), we have 

«<«,«,«) > 1 (IMJ + ||r,u|i;) . 

This gives the lower bound when combined with Lemma 3.1. 

We remark that since the operator P is symmetric and positive definite with respect to 
the inner product a(-,-), the conjugate gradient method can be used. It is obvious that the 
degree of parallelism of the new method is higher than that of the symmetrized multiplicative 
Schwarz algorithms. Here we have considered only Poisson’s equation; the extension of 
the algorithm and theory to general variable coefficient symmetric positive definite cases is 
straightforward. 

The analysis for the symmetric case is included above for theoretical interest. In practice, 
however, we do not believe that this type of upwinding preconditioning would offer much 
improvement over the classical additive Schwarz method for symmetric positive definite 
problems. Some numerical examples are included in the last section of the paper. 
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3.2. Analysis for the general nonsymmetric case. We consider the general non- 
symmetric case in this subsection. The techniques are mainly borrowed from Cai and Wid- 
lund [5, 6]. Let us begin by summarizing the main results, namely that the operator P 
is uniformly bounded and its symmetric part, with respect to the inner product a(-,-), is 
uniformly positive definite, in the following theorem. This theorem provides the optimal 
convergence of several Krylov space iterative methods, including GCR [16] and GMRES [21] 
among others. 

THEOREM 2. There exist positive constants Ho, c(H 0 ) and C, independent of the mesh 
parameters h and H, such that if H < Ho, the operator P is uniformly hounded, i.e., 

\\Pu\\a < c\\u\\ a , Vu e v h , 

and its symmetric part is uniformly positive definite, i.e., 

a(Pu,u) > c||it||2, Vu € V h . 


To prove the above theorem, we need a result from Cai and Widlund [6] regarding the 
optimality of the additive Schwarz preconditioner. 

LEMMA 3.2 (Cai and Widlund[6]). There exist positive constants H 0> c{H 0 ) and C, 
independent of the mesh parameters, such that if H < Ho 

||£T t u|| a <c|M| 0 , vuev h , 

;=o 


and 

2 \\ T i U \\l > C IMIa> Vu € V h . 

t=0 

We next present a number of useful lemmas before giving the proof of the main theorem 
later in this subsection. The following lemma says that the symmetric part of T, is positive 
definite if the size of the subdomains, i.e. H, is sufficiently small. The proof is relatively 
simple, and therefore not included. The constant C appearing in the lemma depends on the 
coefficients 0(x) and c(x) of the elliptic operator L. 

LEMMA 3.3. There exists a positive constant C , independent of the mesh parameters, 
such that 


a(u,T,u) > (1 - CH)\\T,ut - 

for any i, and u 6 Vh- 

The contribution from the first and zeroth order terms of the elliptic operator L is 
estimated in the next lemma. We prove that the contribution is of lower order in H. 

LEMMA 3.4. There exists a positive constant C , independent of the mesh parameters h 
and H , such that for any i,j 0 for which fl, and Clj overlap 
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1. s{TiU,TjU ) < CH (\\T t u\\l 4- \\T-ju\\ 2 a ) 

2. s(T,T 3 u, Tiu) < CH(\\Tiu\\l + \\T jU \\l) 

3. siu.TiTju) < CH (\\T jU \\l + 

for all u € V h . The same estimates hold if the bilinear form s(-,-) is replaced by the bilinear 
form c(-. •). 

We leave the proof of this lemma to the interested reader. The basic idea of the proof is to 
use that HTjuH^njj < CH\\T,u\\ ain > r for any 0. As in the previous lemma, the constants 
C depend on the coefficients /3(x) and c(x ) of the elliptic operator L. Using Lemmas 3.3 
and 3.4, we now proceed to give a lower bound of the two-subdomain multiplicative Schwarz 
operator p* 

Lemma 3.5. There exists a positive constant C, independent of the mesh parameters h 
and H, such that for any i,j for which and Q, J overlap 

a(P,,u,u) > (i - CH) \\TMW + (\ - CH) ||ZjiC - CH\\u\\l in , y 

for any u € Vh * 

Proof. We first note, by using the definition of the operators T t and Tj and the fact that 
6(-, -) = a(-, •) + -s(-, •) + c(>, •), that 

a(PijU,u) = a(T t u,u) + a(TjU,u) -a(TiTjU,u) 

= a{TiU, u ) + a(TjU, u ) - a(T,u, T 3 u) + 

s(T t u , TjU) - c(T iU , 7» + 2s(T i T J m T t u) + 
s(u, TiTju) + c(u, T{Tju). 

The desired proof follows by using Lemmas 3.3 and 3.4. □ 

We are now ready to prove the main theorem of this subsection. The upper bound is 

easy. It can be seen that 

Pa =Ti + (I - Ti)Tj. 

By using the fact that (I - T t ) is uniformly bounded, we obtain 

The upper bound of P can then be obtained by summing the above estimate for all possible 
pairs of subdomains and using Lemma 3.2. To establish the lower bound, we sum the 
estimate in Lemma 3.5 and use the lower bound part of Lemma 3.2, and the assumption 
that H is sufficiently small. 
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Fig. 1. The term TiTj is kept in the Schwarz polynomial only if the flow goes from Qj to 

3*3. A weighted local multiplicative algorithm. In this subsection, we introduce a 
variant of the local multiplicative algorithm that is particularly useful for fluid flow problems. 
The basic philosophy is the same as in the design of any upwinding type discretization 
schemes. We first note that the operator P has the following, more explicit, form 

(8) p= E r.- E TiTj ■ 

0 <i<N 1 <i?j<N 

In other words, P is equal to the regular two-level additive Schwarz operator plus some sec- 
ond order perturbation terms. Since the additional second order terms enhance the nearest 
neighbor communication, we therefore believe they will make the overall convergence faster 
for the classical additive Schwarz algorithms. This observation will be confirmed by a num- 
ber of numerical experiments in the next section. Borrowing a term from the Streamline 
Upwind Petrov-Galerkin (SUPG) methods [18, 20], the second order terms TiTj, if used prop- 
erly, “stabilize” the preconditioner when solving convection-diffusion equations. The SUPG 
method also suggests the following version of the algorithm with weights in the upwinding 
directions. Let 

(9) t= £ T- £ 

0 <i<N 1 <i^j<N 

Here wj equals zero or where 0 < // < 1.0 is a constant. The choice of depends on 

the direction of the flow. The intuition is that if the flow goes from to and if these 

two subdomains are neighbors, then we set /j,j to be a positive constant //, and to zero. 
We have not exploited the possibility of using different fjtij for different pairs of subdomains. 
Of course, if and ft’- are not neighbors we then set fi tJ = fi Jt = 0. The motivation here 
is exactly the same as in using the upwinding techniques in the solution of problems that 
involve hyperbolic components. A difference is that the usual upwinding techniques are 
used only at the discretization level, and our “upwinding” is introduced as a way to define 
the preconditioning polynomial. It is understandable that, for problems that have a strong 
characteristic direction, such as convection-diffusion problems, some kind of upwinding can 
speed up the convergence. 

We now propose a heuristic method to be used to determine the flow direction. Let 
f3(x) = (&i(x),. . .,6d(a;)) T be the characteristic vector of the flow. For each pair of neigh- 
boring subdomains, we choose a curve, such as IY, in Fig. 1 or another curve in 
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that more or less separates the subdomains. Since we are defining preconditioners, it is not 
necessary to find the precise separating curve. Let n,j, defined on T.-j, be the unit vector 
pointing from subdomain f lj to fl,. We define the parameters ft tJ by looking at the sign of 
a line integral 



{ 0 


if J fl{x) ■ riijds > 0 
otherwise, 


where the integral is taken along the curve r,j. 

4. Numerical experiments. In this section, we present some experimental results to 
numerically understand the local multiplicative Schwarz algorithms, and to compare them 
with the classical additive and multiplicative Schwarz algorithms for both symmetric pos- 
itive definite and nonsymmetric problems. Although the proposed methods belong to the 
class of optimal preconditioners, some effort is needed to obtain the best performance for 
a particular test problem, especially in the selection of the parameter /i in both symmetric 
and nonsvmmetric cases. We note that ju = 1.0 is usually not a good choice. As mentioned 
earlier, our optimal convergence theory requires that the coarse grid is sufficiently fine, how- 
ever, in practice, especially in the nonsymmetric cases, it is quite difficult to find a coarse 
grid of proper size such that the convergence is not slower than the purely local (i.e., without 
a coarse space) Schwarz algorithms. 

We consider the following model problem on the unit square 

f Lu = f in fl = (0, 1) x (0, 1), 

| u = 0 on dO. 

The right-hand side / is always chosen such that the exact solution is u = xe Ty $in(irx)sin(iry). 
The coefficients of L will be specified later for each test problem. We use an 256 x 256 uni- 
form fine mesh throughout this section. The number of subdomains is 64 in all test cases, 
i.e., we use an 8 x 8 uniform partitioning of the domain into subdomains, with a uniform 2 h 
overlap between each neighboring subdomains, where h = 1/256. In our experiments, the 
coarse grid linear system and all the subdomain linear systems are solved exactly by using a 
sparse linear system solver from the Argonne National Laboratory software package PETSc 
of Gropp and Smith [19]. All the Schwarz methods are used as left preconditioners for the 
CG method, or the non-restarted GMRES method, with a zero initial guess. We stop the 
CG or GMRES iteration as soon as the preconditioned initial residual is reduced by a factor 
of 10 -5 . We discretize the PDE at both the fine and the coarse levels by the usual five-point 
central, or upwinding, finite difference method. 

Example 0. We first test the algorithms on a simple Poisson’s equation. (This is not 
what the new algorithm is designed for.) In Fig. 2, we show that the new algorithm is slower 
than the multiplicative Schwarz algorithm, but with parameter /.i = 0.3, faster than the 
additive Schwarz algorithm. Without using a proper \i , the algorithm can be slow. An 8 x 8 
coarse solve is included in all cases. The multiplicative Schwarz algorithm is symmetrized in 
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Fig. 2. The curves show the iteration history of the additive, multiplicative and the local multiplicative 
Schwarz preconditioned CG methods. The solid curve represents the local multiplicative Schwarz method , 
the dashed curve represents the additive Schwarz and the broken curve represents the multiplicative Schwarz 
method. 

order to be able to use CG. We remark again that even though the symmetrized multiplicative 
Schwarz is the fastest among the three algorithms, it has the lowest parallelism. The per-step 
arithmetic cost of the new algorithm is higher due the repetition of the subdomain solves. 

Example 1. We let Lu = —V • (Vu) + V • (/?«), where ft — (6j, 6 2 ) is a constant vector 
with = 100.0, or —100.0. We discretize the PDE with the usual five-point central finite 
difference method. We first compare the new method, with (x = 0.5, with the additive and 
multiplicative Schwarz methods without coarse space in the case ft = (100,100). For the 
multiplicative Schwarz, we order the subdomains by the natural ordering. No coloring is 
incorporated in the implementation. The results are presented in the left figure of Fig. 3. It 
can be seen clearly that , for ft = (100, 100), the multiplicative Schwarz method is the fastest 
of the three. However, the situation changes, if we let ft = (—100, —100) and do not change 
the subdomain ordering in the multiplicative Schwarz method. As shown in the left figure 
of Fig. 4, the new method becomes the fastest of the three. Apparently, the changing of the 
flow characteristics hurts the convergence of the multiplicative Schwarz algorithm, but the 
new method does not suffer. 

We next present cases when coarse spaces are included in the preconditioners. The 
optimal convergence theory for all three Schwarz algorithms requires that the coarse grid 
is sufficiently fine. Our numerical experiments suggest that they in fact need coarse grid 
of different sizes, i.e., a sufficiently fine coarse grid for one Schwarz method may not be 
sufficiently fine for the others. We say a coarse space is “good” if the total number of 
iterations is smaller than without it. A coarse grid, not sufficiently fine, usually leads to a 
slower convergence in all Schwarz type methods. In the right figure of Fig. 3, we present 
three Schwarz algorithms with three different coarse grid sizes, namely the multiplicative 
Schwarz with an 16 x 16 coarse grid; the additive Schwarz with an 32 x 32 coarse grid; the 
new method with an 64 x 64 coarse grid, and p = 0.5. Comparing the right figures in Fig. 3 
and Fig. 4, we observe that the multiplicative Schwarz method with a coarse space of proper 
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Fig. 3. Central finite difference discretization with 0 = (100,100). The left figure shows cases without 
coarse spaces, and the right figure shows cases with coarse spaces. The curves show the iteration history 
of the additive (dashed), multiplicative (broken) and the local multiplicative (solid) Schwarz preconditioned 
GMRES methods. 



Fig. 4. Central finite difference discretization with 0 = (-100, -100). The left figure shows cases without 
coarse spaces , and the right figure shows cases with coarse spaces. The curves show the iteration history of the 
additive (dashed), multiplicative (broken) and the local multiplicative (solid) Schwarz preconditioned GMRES 
methods . 

size is always the best of the three. 

Example 2. We let Lu = -V • (Vw) + V • (0u), where 0 = (61,62) is a constant, 
vector with b u b 2 = 1000.0 or -1000.0. The equation is discretized by the usual five-point 
upwinding finite difference method. We run the test code without using coarse spaces for four 
different constant flow directions. As before, for the multiplicative Schwarz preconditioner, 
we order the subdomains in the natural ordering. No coloring is assumed. For the new 
algorithm we use p = 0.7. The residual history is presented in Fig. 5. It is clear that if 
the subdomain ordering does not follow the flow characteristic direction the convergence of 
multiplicative Schwarz becomes significantly worse than in a case when the ordering follows 
the flow. Additive Schwarz is not sensitive at all to such an ordering, but is quite slow. The 
new algorithm does not need any special attention to the ordering, and converges faster than 
(a) the additive Schwarz algorithm in all four cases; (b) the worst case of the multiplicative 
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Fig. 5. The figures show the iteration history of the additive (upper left), multiplicative (upper right ) and 
the new (lower left) Schwarz preconditioned GMRES method. The line types correspond to the flow directions, 
i.e., solid lines (3 = (—1000, —1000), dashed lines 8 = (1000, 1000), dotted lines 8 = (1000,-1000) and broken 
lines 8 — (—1000, 1000). 

Schwarz algorithm. 

Our experience suggests that it is by no means easy to find a coarse space of proper 
size in the case that the PDE is discretized by upwinding finite difference methods. Further 
theoretical and numerical investigation of this situation is underway. 

5. Conclusion. In this paper, we introduce a new class of overlapping domain decom- 
position methods for solving scalar convection-diffusion problems. The method improves 
the classical multiplicative Schwarz methods by reducing their sensitivity with respect to 
the flow direction. For the Galerkin finite element discretization, we prove that the method 
is optimal in the sense that the convergence rate is independent of the mesh size and also 
the number of subdomains in both R 2 and R 3 . Numerical experiments are also reported to 
illustrate the rankings of the methods and some open questions are identified. 

Acknowledgement. The authors are indebted to D. Keyes and 0. Widlund for reading 
a draft of the paper and also many helpful discussions. 
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